Overview

Dataset statistics

Number of variables: 34
Number of observations: 2344823
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 6
Duplicate rows (%): < 0.1%
Total size in memory: 322.0 MiB
Average record size in memory: 144.0 B
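
The derived figures in this table follow from the raw counts. As a quick sanity check, the sketch below recomputes the average record size and the duplicate share. It assumes that MiB means 2**20 bytes; the variable names are illustrative and not part of the report.

```python
# Recompute derived overview figures from the raw values reported above.
# Assumption: "MiB" denotes 2**20 bytes.
N_ROWS = 2_344_823
TOTAL_MIB = 322.0
DUPLICATE_ROWS = 6

avg_record_bytes = TOTAL_MIB * 2**20 / N_ROWS
duplicate_pct = 100 * DUPLICATE_ROWS / N_ROWS

print(f"average record size: {avg_record_bytes:.1f} B")
print(f"duplicate rows: {duplicate_pct:.5f}%")
```

The duplicate share is far below 0.1%, which is why the report shows it only as "< 0.1%".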

Variable types

Numeric: 8
Categorical: 26

Alerts

Dataset has 6 (< 0.1%) duplicate rows [Duplicates]
IN_TREINEIRO is highly overall correlated with TP_FAIXA_ETARIA and 1 other field [High correlation]
Q001 is highly overall correlated with Q002 and 1 other field [High correlation]
Q002 is highly overall correlated with Q001 and 1 other field [High correlation]
Q003 is highly overall correlated with Q001 [High correlation]
Q004 is highly overall correlated with Q002 [High correlation]
Q006 is highly overall correlated with Q018 [High correlation]
Q018 is highly overall correlated with Q006 [High correlation]
TP_ESCOLA is highly overall correlated with TP_ST_CONCLUSAO [High correlation]
TP_FAIXA_ETARIA is highly overall correlated with IN_TREINEIRO [High correlation]
TP_ST_CONCLUSAO is highly overall correlated with IN_TREINEIRO and 1 other field [High correlation]
TP_ESTADO_CIVIL is highly imbalanced (78.1%) [Imbalance]
Q007 is highly imbalanced (70.4%) [Imbalance]
Q011 is highly imbalanced (58.6%) [Imbalance]
Q012 is highly imbalanced (80.3%) [Imbalance]
Q014 is highly imbalanced (56.2%) [Imbalance]
Q015 is highly imbalanced (73.9%) [Imbalance]
Q016 is highly imbalanced (54.3%) [Imbalance]
Q017 is highly imbalanced (89.6%) [Imbalance]
Q025 is highly imbalanced (59.6%) [Imbalance]
TP_COR_RACA has 40871 (1.7%) zeros [Zeros]
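
The imbalance percentages above appear consistent with one minus the normalized Shannon entropy of the category frequencies. This reading is inferred from the numbers in this report, not taken from the tool's documentation, and the function name `imbalance_score` is illustrative. The counts are copied from the TP_ESTADO_CIVIL and Q007 sections of this report.

```python
import math


def imbalance_score(counts):
    """Return 1 - H / log2(k), with H the Shannon entropy (bits) over k categories.

    Assumed definition, inferred from the figures in this report. Needs k >= 2.
    """
    total = sum(counts)
    probs = [c / total for c in counts if c > 0]
    entropy = -sum(p * math.log2(p) for p in probs)
    return 1 - entropy / math.log2(len(counts))


# Category counts copied from the Common Values tables in this report.
tp_estado_civil = [2164815, 78251, 73063, 26871, 1823]
q007 = [2117906, 122670, 76675, 27572]

print(f"TP_ESTADO_CIVIL: {imbalance_score(tp_estado_civil):.1%}")
print(f"Q007: {imbalance_score(q007):.1%}")
```

Under this assumed definition, a score of 0 means the categories are perfectly uniform and a score near 1 means a single category dominates.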

Reproduction

Analysis started: 2024-04-15 00:59:28.309207
Analysis finished: 2024-04-15 01:03:21.081847
Duration: 3 minutes and 52.77 seconds
Software version: ydata-profiling v4.7.0
Download configuration: config.json

Variables

TP_FAIXA_ETARIA
Real number (ℝ)

HIGH CORRELATION 

Distinct: 20
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 4.2380606
Minimum: 1
Maximum: 20
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 8.9 MiB

Quantile statistics

Minimum: 1
5th percentile: 1
Q1: 2
Median: 3
Q3: 5
95th percentile: 12
Maximum: 20
Range: 19
Interquartile range (IQR): 3

Descriptive statistics

Standard deviation: 3.3427183
Coefficient of variation (CV): 0.78873774
Kurtosis: 2.2625056
Mean: 4.2380606
Median Absolute Deviation (MAD): 1
Skewness: 1.6837301
Sum: 9937502
Variance: 11.173766
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=20)
Value              Count   Frequency (%)
3                  593355  25.3%
2                  576153  24.6%
4                  269293  11.5%
1                  247749  10.6%
5                  151508  6.5%
6                  95792   4.1%
11                 88159   3.8%
7                  67515   2.9%
8                  49832   2.1%
12                 47083   2.0%
Other values (10)  158384  6.8%

Value  Count   Frequency (%)
1      247749  10.6%
2      576153  24.6%
3      593355  25.3%
4      269293  11.5%
5      151508  6.5%
6      95792   4.1%
7      67515   2.9%
8      49832   2.1%
9      36867   1.6%
10     30176   1.3%

Value  Count  Frequency (%)
20     315    < 0.1%
19     852    < 0.1%
18     2090   0.1%
17     5149   0.2%
16     9309   0.4%
15     15339  0.7%
14     23948  1.0%
13     34339  1.5%
12     47083  2.0%
11     88159  3.8%
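
The descriptive statistics for TP_FAIXA_ETARIA can be recomputed from the value counts alone, since the tables above list all 20 distinct values. The sketch below does this; it assumes the report uses the sample variance (divisor n - 1), and the variable names are illustrative.

```python
import math

# Value -> count for TP_FAIXA_ETARIA, copied from the tables above.
counts = {
    1: 247749, 2: 576153, 3: 593355, 4: 269293, 5: 151508,
    6: 95792, 7: 67515, 8: 49832, 9: 36867, 10: 30176,
    11: 88159, 12: 47083, 13: 34339, 14: 23948, 15: 15339,
    16: 9309, 17: 5149, 18: 2090, 19: 852, 20: 315,
}

n = sum(counts.values())                          # number of observations
total = sum(v * c for v, c in counts.items())     # the "Sum" statistic
mean = total / n
# Assumption: sample variance (divisor n - 1).
variance = sum(c * (v - mean) ** 2 for v, c in counts.items()) / (n - 1)
std = math.sqrt(variance)
cv = std / mean                                   # coefficient of variation

print(n, total)
print(f"mean={mean:.7f} variance={variance:.6f} std={std:.7f} cv={cv:.8f}")
```

With more than two million rows, the choice between the n and n - 1 divisor changes the variance only around the sixth decimal place.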

TP_SEXO
Categorical

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 17.9 MiB

0  1436668
1  908155

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Characters and Unicode

Total characters: 2344823
Distinct characters: 2
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 0
2nd row: 1
3rd row: 0
4th row: 0
5th row: 0

Common Values

Value  Count    Frequency (%)
0      1436668  61.3%
1      908155   38.7%

Length

Histogram of lengths of the category

Common Values (Plot)

Value  Count    Frequency (%)
0      1436668  61.3%
1      908155   38.7%

Most occurring characters

Value  Count    Frequency (%)
0      1436668  61.3%
1      908155   38.7%

Most occurring categories

Value      Count    Frequency (%)
(unknown)  2344823  100.0%

Most frequent character per category

(unknown)
Value  Count    Frequency (%)
0      1436668  61.3%
1      908155   38.7%

Most occurring scripts

Value      Count    Frequency (%)
(unknown)  2344823  100.0%

Most frequent character per script

(unknown)
Value  Count    Frequency (%)
0      1436668  61.3%
1      908155   38.7%

Most occurring blocks

Value      Count    Frequency (%)
(unknown)  2344823  100.0%

Most frequent character per block

(unknown)
Value  Count    Frequency (%)
0      1436668  61.3%
1      908155   38.7%

TP_ESTADO_CIVIL
Categorical

IMBALANCE 

Distinct: 5
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 17.9 MiB

1  2164815
2  78251
0  73063
3  26871
4  1823

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Characters and Unicode

Total characters: 2344823
Distinct characters: 5
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 1
2nd row: 1
3rd row: 1
4th row: 1
5th row: 1

Common Values

Value  Count    Frequency (%)
1      2164815  92.3%
2      78251    3.3%
0      73063    3.1%
3      26871    1.1%
4      1823     0.1%

Length

Histogram of lengths of the category

Common Values (Plot)

Value  Count    Frequency (%)
1      2164815  92.3%
2      78251    3.3%
0      73063    3.1%
3      26871    1.1%
4      1823     0.1%

Most occurring characters

Value  Count    Frequency (%)
1      2164815  92.3%
2      78251    3.3%
0      73063    3.1%
3      26871    1.1%
4      1823     0.1%

Most occurring categories

Value      Count    Frequency (%)
(unknown)  2344823  100.0%

Most frequent character per category

(unknown)
Value  Count    Frequency (%)
1      2164815  92.3%
2      78251    3.3%
0      73063    3.1%
3      26871    1.1%
4      1823     0.1%

Most occurring scripts

Value      Count    Frequency (%)
(unknown)  2344823  100.0%

Most frequent character per script

(unknown)
Value  Count    Frequency (%)
1      2164815  92.3%
2      78251    3.3%
0      73063    3.1%
3      26871    1.1%
4      1823     0.1%

Most occurring blocks

Value      Count    Frequency (%)
(unknown)  2344823  100.0%

Most frequent character per block

(unknown)
Value  Count    Frequency (%)
1      2164815  92.3%
2      78251    3.3%
0      73063    3.1%
3      26871    1.1%
4      1823     0.1%

TP_COR_RACA
Real number (ℝ)

ZEROS 

Distinct: 6
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 1.991332
Minimum: 0
Maximum: 5
Zeros: 40871
Zeros (%): 1.7%
Negative: 0
Negative (%): 0.0%
Memory size: 8.9 MiB

Quantile statistics

Minimum: 0
5th percentile: 1
Q1: 1
Median: 2
Q3: 3
95th percentile: 3
Maximum: 5
Range: 5
Interquartile range (IQR): 2

Descriptive statistics

Standard deviation: 1.0184765
Coefficient of variation (CV): 0.51145492
Kurtosis: -1.3097986
Mean: 1.991332
Median Absolute Deviation (MAD): 1
Skewness: 0.1328009
Sum: 4669321
Variance: 1.0372944
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=6)
Value  Count    Frequency (%)
1      1026418  43.8%
3      966698   41.2%
2      255863   10.9%
4      43782    1.9%
0      40871    1.7%
5      11191    0.5%

Value  Count    Frequency (%)
0      40871    1.7%
1      1026418  43.8%
2      255863   10.9%
3      966698   41.2%
4      43782    1.9%
5      11191    0.5%

Value  Count    Frequency (%)
5      11191    0.5%
4      43782    1.9%
3      966698   41.2%
2      255863   10.9%
1      1026418  43.8%
0      40871    1.7%
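
The quantile statistics for TP_COR_RACA can be read off the cumulative counts. The sketch below uses the nearest-rank rule, which is an assumption: the profiler may interpolate between ranks, so the two methods need not agree on every dataset. The helper `quantile_from_counts` is a hypothetical name.

```python
import math

# Value -> count for TP_COR_RACA, copied from the tables above.
counts = {0: 40871, 1: 1026418, 2: 255863, 3: 966698, 4: 43782, 5: 11191}


def quantile_from_counts(counts, q):
    """Smallest value whose cumulative count reaches ceil(q * n) (nearest-rank rule)."""
    n = sum(counts.values())
    rank = max(1, math.ceil(q * n))
    cumulative = 0
    for value in sorted(counts):
        cumulative += counts[value]
        if cumulative >= rank:
            return value
    raise ValueError("q must be in [0, 1]")


quantiles = {q: quantile_from_counts(counts, q) for q in (0.05, 0.25, 0.5, 0.75, 0.95)}
iqr = quantiles[0.75] - quantiles[0.25]
print(quantiles, iqr)
```

For a variable with only six distinct integer codes, each quantile simply lands on whichever code contains the corresponding rank.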

TP_ST_CONCLUSAO
Categorical

HIGH CORRELATION 

Distinct: 4
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 17.9 MiB

1  963119
2  957731
3  417070
4  6903

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Characters and Unicode

Total characters: 2344823
Distinct characters: 4
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 1
2nd row: 1
3rd row: 1
4th row: 1
5th row: 2

Common Values

Value  Count   Frequency (%)
1      963119  41.1%
2      957731  40.8%
3      417070  17.8%
4      6903    0.3%

Length

Histogram of lengths of the category

Common Values (Plot)

Value  Count   Frequency (%)
1      963119  41.1%
2      957731  40.8%
3      417070  17.8%
4      6903    0.3%

Most occurring characters

Value  Count   Frequency (%)
1      963119  41.1%
2      957731  40.8%
3      417070  17.8%
4      6903    0.3%

Most occurring categories

Value      Count    Frequency (%)
(unknown)  2344823  100.0%

Most frequent character per category

(unknown)
Value  Count   Frequency (%)
1      963119  41.1%
2      957731  40.8%
3      417070  17.8%
4      6903    0.3%

Most occurring scripts

Value      Count    Frequency (%)
(unknown)  2344823  100.0%

Most frequent character per script

(unknown)
Value  Count   Frequency (%)
1      963119  41.1%
2      957731  40.8%
3      417070  17.8%
4      6903    0.3%

Most occurring blocks

Value      Count    Frequency (%)
(unknown)  2344823  100.0%

Most frequent character per block

(unknown)
Value  Count   Frequency (%)
1      963119  41.1%
2      957731  40.8%
3      417070  17.8%
4      6903    0.3%

TP_ESCOLA
Categorical

HIGH CORRELATION 

Distinct: 3
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 17.9 MiB

1  1387092
2  760853
3  196878

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Characters and Unicode

Total characters: 2344823
Distinct characters: 3
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 1
2nd row: 1
3rd row: 1
4th row: 1
5th row: 3

Common Values

Value  Count    Frequency (%)
1      1387092  59.2%
2      760853   32.4%
3      196878   8.4%

Length

Histogram of lengths of the category

Common Values (Plot)

Value  Count    Frequency (%)
1      1387092  59.2%
2      760853   32.4%
3      196878   8.4%

Most occurring characters

Value  Count    Frequency (%)
1      1387092  59.2%
2      760853   32.4%
3      196878   8.4%

Most occurring categories

Value      Count    Frequency (%)
(unknown)  2344823  100.0%

Most frequent character per category

(unknown)
Value  Count    Frequency (%)
1      1387092  59.2%
2      760853   32.4%
3      196878   8.4%

Most occurring scripts

Value      Count    Frequency (%)
(unknown)  2344823  100.0%

Most frequent character per script

(unknown)
Value  Count    Frequency (%)
1      1387092  59.2%
2      760853   32.4%
3      196878   8.4%

Most occurring blocks

Value      Count    Frequency (%)
(unknown)  2344823  100.0%

Most frequent character per block

(unknown)
Value  Count    Frequency (%)
1      1387092  59.2%
2      760853   32.4%
3      196878   8.4%

IN_TREINEIRO
Categorical

HIGH CORRELATION 

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 17.9 MiB

0  1927753
1  417070

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Characters and Unicode

Total characters: 2344823
Distinct characters: 2
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 0
2nd row: 0
3rd row: 0
4th row: 0
5th row: 0

Common Values

Value  Count    Frequency (%)
0      1927753  82.2%
1      417070   17.8%

Length

Histogram of lengths of the category

Common Values (Plot)

Value  Count    Frequency (%)
0      1927753  82.2%
1      417070   17.8%

Most occurring characters

Value  Count    Frequency (%)
0      1927753  82.2%
1      417070   17.8%

Most occurring categories

Value      Count    Frequency (%)
(unknown)  2344823  100.0%

Most frequent character per category

(unknown)
Value  Count    Frequency (%)
0      1927753  82.2%
1      417070   17.8%

Most occurring scripts

Value      Count    Frequency (%)
(unknown)  2344823  100.0%

Most frequent character per script

(unknown)
Value  Count    Frequency (%)
0      1927753  82.2%
1      417070   17.8%

Most occurring blocks

Value      Count    Frequency (%)
(unknown)  2344823  100.0%

Most frequent character per block

(unknown)
Value  Count    Frequency (%)
0      1927753  82.2%
1      417070   17.8%

TP_LINGUA
Categorical

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 17.9 MiB

0  1357622
1  987201

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Characters and Unicode

Total characters: 2344823
Distinct characters: 2
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 1
2nd row: 1
3rd row: 1
4th row: 1
5th row: 1

Common Values

Value  Count    Frequency (%)
0      1357622  57.9%
1      987201   42.1%

Length

Histogram of lengths of the category

Common Values (Plot)

Value  Count    Frequency (%)
0      1357622  57.9%
1      987201   42.1%

Most occurring characters

Value  Count    Frequency (%)
0      1357622  57.9%
1      987201   42.1%

Most occurring categories

Value      Count    Frequency (%)
(unknown)  2344823  100.0%

Most frequent character per category

(unknown)
Value  Count    Frequency (%)
0      1357622  57.9%
1      987201   42.1%

Most occurring scripts

Value      Count    Frequency (%)
(unknown)  2344823  100.0%

Most frequent character per script

(unknown)
Value  Count    Frequency (%)
0      1357622  57.9%
1      987201   42.1%

Most occurring blocks

Value      Count    Frequency (%)
(unknown)  2344823  100.0%

Most frequent character per block

(unknown)
Value  Count    Frequency (%)
0      1357622  57.9%
1      987201   42.1%

Q001
Real number (ℝ)

HIGH CORRELATION 

Distinct: 8
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 4.5935045
Minimum: 1
Maximum: 8
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 8.9 MiB

Quantile statistics

Minimum: 1
5th percentile: 2
Q1: 3
Median: 5
Q3: 6
95th percentile: 8
Maximum: 8
Range: 7
Interquartile range (IQR): 3

Descriptive statistics

Standard deviation: 1.8760278
Coefficient of variation (CV): 0.40840883
Kurtosis: -0.75607041
Mean: 4.5935045
Median Absolute Deviation (MAD): 1
Skewness: 0.03745095
Sum: 10770955
Variance: 3.5194803
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=8)
Value  Count   Frequency (%)
5      720913  30.7%
2      350284  14.9%
3      291859  12.4%
4      259673  11.1%
6      250540  10.7%
8      202637  8.6%
7      193050  8.2%
1      75867   3.2%

Value  Count   Frequency (%)
1      75867   3.2%
2      350284  14.9%
3      291859  12.4%
4      259673  11.1%
5      720913  30.7%
6      250540  10.7%
7      193050  8.2%
8      202637  8.6%

Value  Count   Frequency (%)
8      202637  8.6%
7      193050  8.2%
6      250540  10.7%
5      720913  30.7%
4      259673  11.1%
3      291859  12.4%
2      350284  14.9%
1      75867   3.2%

Q002
Real number (ℝ)

HIGH CORRELATION 

Distinct: 8
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 4.8018234
Minimum: 1
Maximum: 8
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 8.9 MiB

Quantile statistics

Minimum: 1
5th percentile: 2
Q1: 4
Median: 5
Q3: 6
95th percentile: 7
Maximum: 8
Range: 7
Interquartile range (IQR): 2

Descriptive statistics

Standard deviation: 1.6215216
Coefficient of variation (CV): 0.33768873
Kurtosis: -0.43221959
Mean: 4.8018234
Median Absolute Deviation (MAD): 1
Skewness: -0.32283434
Sum: 11259426
Variance: 2.6293324
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=8)
Value  Count   Frequency (%)
5      851162  36.3%
6      329384  14.0%
7      322603  13.8%
4      262334  11.2%
2      238683  10.2%
3      232165  9.9%
8      62486   2.7%
1      46006   2.0%

Value  Count   Frequency (%)
1      46006   2.0%
2      238683  10.2%
3      232165  9.9%
4      262334  11.2%
5      851162  36.3%
6      329384  14.0%
7      322603  13.8%
8      62486   2.7%

Value  Count   Frequency (%)
8      62486   2.7%
7      322603  13.8%
6      329384  14.0%
5      851162  36.3%
4      262334  11.2%
3      232165  9.9%
2      238683  10.2%
1      46006   2.0%

Q003
Real number (ℝ)

HIGH CORRELATION 

Distinct: 6
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 3.2131082
Minimum: 1
Maximum: 6
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 8.9 MiB

Quantile statistics

Minimum: 1
5th percentile: 1
Q1: 2
Median: 3
Q3: 4
95th percentile: 6
Maximum: 6
Range: 5
Interquartile range (IQR): 2

Descriptive statistics

Standard deviation: 1.5404451
Coefficient of variation (CV): 0.47942521
Kurtosis: -0.86074084
Mean: 3.2131082
Median Absolute Deviation (MAD): 1
Skewness: 0.24137501
Sum: 7534170
Variance: 2.372971
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=6)
Value  Count   Frequency (%)
3      531108  22.7%
4      525019  22.4%
2      437711  18.7%
1      387595  16.5%
6      260803  11.1%
5      202587  8.6%

Value  Count   Frequency (%)
1      387595  16.5%
2      437711  18.7%
3      531108  22.7%
4      525019  22.4%
5      202587  8.6%
6      260803  11.1%

Value  Count   Frequency (%)
6      260803  11.1%
5      202587  8.6%
4      525019  22.4%
3      531108  22.7%
2      437711  18.7%
1      387595  16.5%
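
The skewness reported for Q003 can be approximated from the value counts using central moments. The sketch below uses the plain moment estimator m3 / m2**1.5. Whether the report applies a bias correction is an assumption left open here; with this many rows, a correction would change the result only marginally.

```python
# Value -> count for Q003, copied from the tables above.
counts = {1: 387595, 2: 437711, 3: 531108, 4: 525019, 5: 202587, 6: 260803}

n = sum(counts.values())
mean = sum(v * c for v, c in counts.items()) / n

# Second and third central moments (population form, divisor n).
m2 = sum(c * (v - mean) ** 2 for v, c in counts.items()) / n
m3 = sum(c * (v - mean) ** 3 for v, c in counts.items()) / n

skewness = m3 / m2 ** 1.5
print(f"mean={mean:.7f} skewness={skewness:.4f}")
```

A small positive value like this one indicates a mild right tail: codes 5 and 6 pull the mean slightly above the median of 3.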

Q004
Real number (ℝ)

HIGH CORRELATION 

Distinct: 6
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 3.0045539
Minimum: 1
Maximum: 6
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 8.9 MiB

Quantile statistics

Minimum: 1
5th percentile: 1
Q1: 2
Median: 2
Q3: 4
95th percentile: 6
Maximum: 6
Range: 5
Interquartile range (IQR): 2

Descriptive statistics

Standard deviation: 1.4813366
Coefficient of variation (CV): 0.49303048
Kurtosis: -0.81308274
Mean: 3.0045539
Median Absolute Deviation (MAD): 1
Skewness: 0.48345606
Sum: 7045147
Variance: 2.1943582
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=6)
Value  Count   Frequency (%)
2      902602  38.5%
4      645302  27.5%
1      309181  13.2%
6      196040  8.4%
5      149110  6.4%
3      142588  6.1%

Value  Count   Frequency (%)
1      309181  13.2%
2      902602  38.5%
3      142588  6.1%
4      645302  27.5%
5      149110  6.4%
6      196040  8.4%

Value  Count   Frequency (%)
6      196040  8.4%
5      149110  6.4%
4      645302  27.5%
3      142588  6.1%
2      902602  38.5%
1      309181  13.2%

Q005
Categorical

Distinct: 20
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 17.9 MiB

4                  822269
3                  660095
5                  352232
2                  284229
6                  111907
Other values (15)  114091

Length

Max length: 2
Median length: 1
Mean length: 1.0026625
Min length: 1

Characters and Unicode

Total characters: 2351066
Distinct characters: 10
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 2
2nd row: 3
3rd row: 5
4th row: 2
5th row: 4

Common Values

Value              Count   Frequency (%)
4                  822269  35.1%
3                  660095  28.2%
5                  352232  15.0%
2                  284229  12.1%
6                  111907  4.8%
1                  47123   2.0%
7                  39124   1.7%
8                  15733   0.7%
9                  5868    0.3%
10                 3316    0.1%
Other values (10)  2927    0.1%
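
The character totals for Q005 follow from the value counts. The sketch below assumes that every value grouped under "Other values (10)" is a two-character code; this is consistent with 20 distinct values and a maximum length of 2, but the report does not list those values individually. Variable names are illustrative.

```python
N_ROWS = 2_344_823

# Rows whose value is two characters long: "10" plus the grouped remainder.
# Assumption: all 10 values in "Other values (10)" are two-character codes.
two_char_rows = 3316 + 2927
one_char_rows = N_ROWS - two_char_rows

total_characters = one_char_rows + 2 * two_char_rows
mean_length = total_characters / N_ROWS

print(total_characters, f"{mean_length:.7f}")
```

The result can be compared with the "Total characters" and "Mean length" entries reported for Q005.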

Length

Histogram of lengths of the category

Value              Count   Frequency (%)
4                  822269  35.1%
3                  660095  28.2%
5                  352232  15.0%
2                  284229  12.1%
6                  111907  4.8%
1                  47123   2.0%
7                  39124   1.7%
8                  15733   0.7%
9                  5868    0.3%
10                 3316    0.1%
Other values (10)  2927    0.1%

Most occurring characters

Value  Count   Frequency (%)
4      822459  35.0%
3      660418  28.1%
5      352391  15.0%
2      285189  12.1%
6      111957  4.8%
1      54345   2.3%
7      39167   1.7%
8      15757   0.7%
9      5895    0.3%
0      3488    0.1%

Most occurring categories

Value      Count    Frequency (%)
(unknown)  2351066  100.0%

Most frequent character per category

(unknown)
Value  Count   Frequency (%)
4      822459  35.0%
3      660418  28.1%
5      352391  15.0%
2      285189  12.1%
6      111957  4.8%
1      54345   2.3%
7      39167   1.7%
8      15757   0.7%
9      5895    0.3%
0      3488    0.1%

Most occurring scripts

Value      Count    Frequency (%)
(unknown)  2351066  100.0%

Most frequent character per script

(unknown)
Value  Count   Frequency (%)
4      822459  35.0%
3      660418  28.1%
5      352391  15.0%
2      285189  12.1%
6      111957  4.8%
1      54345   2.3%
7      39167   1.7%
8      15757   0.7%
9      5895    0.3%
0      3488    0.1%

Most occurring blocks

Value      Count    Frequency (%)
(unknown)  2351066  100.0%

Most frequent character per block

(unknown)
Value  Count   Frequency (%)
4      822459  35.0%
3      660418  28.1%
5      352391  15.0%
2      285189  12.1%
6      111957  4.8%
1      54345   2.3%
7      39167   1.7%
8      15757   0.7%
9      5895    0.3%
0      3488    0.1%

Q006
Real number (ℝ)

HIGH CORRELATION 

Distinct: 17
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 5.0370689
Minimum: 1
Maximum: 17
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 8.9 MiB

Quantile statistics

Minimum: 1
5th percentile: 1
Q1: 2
Median: 4
Q3: 7
95th percentile: 14
Maximum: 17
Range: 16
Interquartile range (IQR): 5

Descriptive statistics

Standard deviation: 3.7950422
Coefficient of variation (CV): 0.75342273
Kurtosis: 1.4457602
Mean: 5.0370689
Median Absolute Deviation (MAD): 2
Skewness: 1.4291988
Sum: 11811035
Variance: 14.402345
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=17)
Value             Count   Frequency (%)
2                 630492  26.9%
3                 369704  15.8%
4                 276804  11.8%
5                 194527  8.3%
8                 145866  6.2%
7                 145812  6.2%
1                 119268  5.1%
6                 115203  4.9%
9                 62344   2.7%
10                44115   1.9%
Other values (7)  240688  10.3%

Value  Count   Frequency (%)
1      119268  5.1%
2      630492  26.9%
3      369704  15.8%
4      276804  11.8%
5      194527  8.3%
6      115203  4.9%
7      145812  6.2%
8      145866  6.2%
9      62344   2.7%
10     44115   1.9%

Value  Count   Frequency (%)
17     39060   1.7%
16     29282   1.2%
15     31826   1.4%
14     28365   1.2%
13     39300   1.7%
12     41407   1.8%
11     31448   1.3%
10     44115   1.9%
9      62344   2.7%
8      145866  6.2%

Q007
Categorical

IMBALANCE 

Distinct: 4
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 17.9 MiB

1  2117906
2  122670
4  76675
3  27572

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Characters and Unicode

Total characters: 2344823
Distinct characters: 4
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 1
2nd row: 1
3rd row: 1
4th row: 1
5th row: 1

Common Values

Value  Count    Frequency (%)
1      2117906  90.3%
2      122670   5.2%
4      76675    3.3%
3      27572    1.2%

Length

Histogram of lengths of the category

Common Values (Plot)

Value  Count    Frequency (%)
1      2117906  90.3%
2      122670   5.2%
4      76675    3.3%
3      27572    1.2%

Most occurring characters

Value  Count    Frequency (%)
1      2117906  90.3%
2      122670   5.2%
4      76675    3.3%
3      27572    1.2%

Most occurring categories

Value      Count    Frequency (%)
(unknown)  2344823  100.0%

Most frequent character per category

(unknown)
Value  Count    Frequency (%)
1      2117906  90.3%
2      122670   5.2%
4      76675    3.3%
3      27572    1.2%

Most occurring scripts

Value      Count    Frequency (%)
(unknown)  2344823  100.0%

Most frequent character per script

(unknown)
Value  Count    Frequency (%)
1      2117906  90.3%
2      122670   5.2%
4      76675    3.3%
3      27572    1.2%

Most occurring blocks

Value      Count    Frequency (%)
(unknown)  2344823  100.0%

Most frequent character per block

(unknown)
Value  Count    Frequency (%)
1      2117906  90.3%
2      122670   5.2%
4      76675    3.3%
3      27572    1.2%

Q008
Categorical

Distinct: 5
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 17.9 MiB

2  1417203
3  606569
4  197448
5  109713
1  13890

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Characters and Unicode

Total characters: 2344823
Distinct characters: 5
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 2
2nd row: 3
3rd row: 2
4th row: 2
5th row: 2

Common Values

Value  Count    Frequency (%)
2      1417203  60.4%
3      606569   25.9%
4      197448   8.4%
5      109713   4.7%
1      13890    0.6%

Length

Histogram of lengths of the category

Common Values (Plot)

Value  Count    Frequency (%)
2      1417203  60.4%
3      606569   25.9%
4      197448   8.4%
5      109713   4.7%
1      13890    0.6%

Most occurring characters

Value  Count    Frequency (%)
2      1417203  60.4%
3      606569   25.9%
4      197448   8.4%
5      109713   4.7%
1      13890    0.6%

Most occurring categories

Value      Count    Frequency (%)
(unknown)  2344823  100.0%

Most frequent character per category

(unknown)
Value  Count    Frequency (%)
2      1417203  60.4%
3      606569   25.9%
4      197448   8.4%
5      109713   4.7%
1      13890    0.6%

Most occurring scripts

Value      Count    Frequency (%)
(unknown)  2344823  100.0%

Most frequent character per script

(unknown)
Value  Count    Frequency (%)
2      1417203  60.4%
3      606569   25.9%
4      197448   8.4%
5      109713   4.7%
1      13890    0.6%

Most occurring blocks

Value      Count    Frequency (%)
(unknown)  2344823  100.0%

Most frequent character per block

(unknown)
Value  Count    Frequency (%)
2      1417203  60.4%
3      606569   25.9%
4      197448   8.4%
5      109713   4.7%
1      13890    0.6%

Q009
Categorical

Distinct: 5
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 17.9 MiB

3  1130738
4  823909
2  227697
5  149202
1  13277

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Characters and Unicode

Total characters: 2344823
Distinct characters: 5
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 3
2nd row: 4
3rd row: 3
4th row: 2
5th row: 3

Common Values

Value  Count    Frequency (%)
3      1130738  48.2%
4      823909   35.1%
2      227697   9.7%
5      149202   6.4%
1      13277    0.6%

Length

Histogram of lengths of the category

Common Values (Plot)

Value  Count    Frequency (%)
3      1130738  48.2%
4      823909   35.1%
2      227697   9.7%
5      149202   6.4%
1      13277    0.6%

Most occurring characters

Value  Count    Frequency (%)
3      1130738  48.2%
4      823909   35.1%
2      227697   9.7%
5      149202   6.4%
1      13277    0.6%

Most occurring categories

Value      Count    Frequency (%)
(unknown)  2344823  100.0%

Most frequent character per category

(unknown)
Value  Count    Frequency (%)
3      1130738  48.2%
4      823909   35.1%
2      227697   9.7%
5      149202   6.4%
1      13277    0.6%

Most occurring scripts

Value      Count    Frequency (%)
(unknown)  2344823  100.0%

Most frequent character per script

(unknown)
Value  Count    Frequency (%)
3      1130738  48.2%
4      823909   35.1%
2      227697   9.7%
5      149202   6.4%
1      13277    0.6%

Most occurring blocks

Value      Count    Frequency (%)
(unknown)  2344823  100.0%

Most frequent character per block

(unknown)
Value  Count    Frequency (%)
3      1130738  48.2%
4      823909   35.1%
2      227697   9.7%
5      149202   6.4%
1      13277    0.6%

Q010
Categorical

Distinct: 5
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 17.9 MiB

1  1100765
2  959173
3  249117
4  29488
5  6280

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Characters and Unicode

Total characters: 2344823
Distinct characters: 5
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 1
2nd row: 1
3rd row: 2
4th row: 1
5th row: 1

Common Values

Value  Count    Frequency (%)
1      1100765  46.9%
2      959173   40.9%
3      249117   10.6%
4      29488    1.3%
5      6280     0.3%

Length

Histogram of lengths of the category

Common Values (Plot)

Value  Count    Frequency (%)
1      1100765  46.9%
2      959173   40.9%
3      249117   10.6%
4      29488    1.3%
5      6280     0.3%

Most occurring characters

Value  Count    Frequency (%)
1      1100765  46.9%
2      959173   40.9%
3      249117   10.6%
4      29488    1.3%
5      6280     0.3%

Most occurring categories

Value      Count    Frequency (%)
(unknown)  2344823  100.0%

Most frequent character per category

(unknown)
Value  Count    Frequency (%)
1      1100765  46.9%
2      959173   40.9%
3      249117   10.6%
4      29488    1.3%
5      6280     0.3%

Most occurring scripts

Value      Count    Frequency (%)
(unknown)  2344823  100.0%

Most frequent character per script

(unknown)
Value  Count    Frequency (%)
1      1100765  46.9%
2      959173   40.9%
3      249117   10.6%
4      29488    1.3%
5      6280     0.3%

Most occurring blocks

Value      Count    Frequency (%)
(unknown)  2344823  100.0%

Most frequent character per block

(unknown)
Value  Count    Frequency (%)
1      1100765  46.9%
2      959173   40.9%
3      249117   10.6%
4      29488    1.3%
5      6280     0.3%

Q011
Categorical

IMBALANCE 

Distinct: 5
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 17.9 MiB

1  1752408
2  524529
3  61037
4  5759
5  1090

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Characters and Unicode

Total characters: 2344823
Distinct characters: 5
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 1
2nd row: 1
3rd row: 1
4th row: 1
5th row: 2

Common Values

Value  Count    Frequency (%)
1      1752408  74.7%
2      524529   22.4%
3      61037    2.6%
4      5759     0.2%
5      1090     < 0.1%

Length

Histogram of lengths of the category

Common Values (Plot)

Value  Count    Frequency (%)
1      1752408  74.7%
2      524529   22.4%
3      61037    2.6%
4      5759     0.2%
5      1090     < 0.1%

Most occurring characters

Value  Count    Frequency (%)
1      1752408  74.7%
2      524529   22.4%
3      61037    2.6%
4      5759     0.2%
5      1090     < 0.1%

Most occurring categories

Value      Count    Frequency (%)
(unknown)  2344823  100.0%

Most frequent character per category

(unknown)
Value  Count    Frequency (%)
1      1752408  74.7%
2      524529   22.4%
3      61037    2.6%
4      5759     0.2%
5      1090     < 0.1%

Most occurring scripts

Value      Count    Frequency (%)
(unknown)  2344823  100.0%

Most frequent character per script

(unknown)
Value  Count    Frequency (%)
1      1752408  74.7%
2      524529   22.4%
3      61037    2.6%
4      5759     0.2%
5      1090     < 0.1%

Most occurring blocks

Value      Count    Frequency (%)
(unknown)  2344823  100.0%

Most frequent character per block

(unknown)
Value  Count    Frequency (%)
1      1752408  74.7%
2      524529   22.4%
3      61037    2.6%
4      5759     0.2%
5      1090     < 0.1%

Q012
Categorical

IMBALANCE 

Distinct: 5
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 17.9 MiB

2  2170790
3  134253
1  27741
4  10098
5  1941

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Characters and Unicode

Total characters: 2344823
Distinct characters: 5
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 2
2nd row: 2
3rd row: 3
4th row: 2
5th row: 2

Common Values

Value  Count    Frequency (%)
2      2170790  92.6%
3      134253   5.7%
1      27741    1.2%
4      10098    0.4%
5      1941     0.1%

Length

[Figure: histogram of lengths of the category]

Common Values (Plot)

[Figure: plot of the Common Values counts]

Most occurring characters

Every value is a single character, so the character counts (overall and per category, script, and block) are identical to the Common Values table. All 2344823 characters fall into a single category, script, and block, each listed as (unknown).
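
The IMBALANCE flag on Q012 can be sanity-checked from the counts above. The sketch below assumes the score is one minus the Shannon entropy of the class shares divided by its maximum, log2(k) for k classes. That formula is an assumption about how the profiler scores imbalance, not something the report states, and `imbalance_score` is a hypothetical helper name.

```python
import math


def imbalance_score(counts):
    """1 - H / H_max, where H is the Shannon entropy of the class shares."""
    counts = [c for c in counts if c > 0]
    k = len(counts)
    if k <= 1:
        return 0.0  # a single class has no entropy to normalise
    total = sum(counts)
    entropy = -sum((c / total) * math.log2(c / total) for c in counts)
    return 1.0 - entropy / math.log2(k)


# Counts copied from the Q012 Common Values table.
q012_counts = [2170790, 134253, 27741, 10098, 1941]
print(f"Q012 imbalance: {imbalance_score(q012_counts):.3f}")
```

Under this assumption the score for Q012 comes out at roughly 0.80, in line with the 80.3% listed for Q012 under Alerts. A perfectly balanced variable would score 0.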

Q013
Categorical

Distinct: 5
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 17.9 MiB

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Characters and Unicode

Total characters: 2344823
Distinct characters: 5
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 2
2nd row: 1
3rd row: 2
4th row: 1
5th row: 1

Common Values

Value | Count | Frequency (%)
1 | 1172567 | 50.0%
2 | 1070208 | 45.6%
3 | 88886 | 3.8%
4 | 10781 | 0.5%
5 | 2381 | 0.1%

Length

[Figure: histogram of lengths of the category]

Common Values (Plot)

[Figure: plot of the Common Values counts]

Most occurring characters

Every value is a single character, so the character counts (overall and per category, script, and block) are identical to the Common Values table. All 2344823 characters fall into a single category, script, and block, each listed as (unknown).

Q014
Categorical

IMBALANCE 

Distinct: 5
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 17.9 MiB

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Characters and Unicode

Total characters: 2344823
Distinct characters: 5
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 1
2nd row: 2
3rd row: 1
4th row: 1
5th row: 1

Common Values

Value | Count | Frequency (%)
2 | 1530846 | 65.3%
1 | 782921 | 33.4%
3 | 29821 | 1.3%
4 | 1032 | < 0.1%
5 | 203 | < 0.1%

Length

[Figure: histogram of lengths of the category]

Common Values (Plot)

[Figure: plot of the Common Values counts]

Most occurring characters

Every value is a single character, so the character counts (overall and per category, script, and block) are identical to the Common Values table. All 2344823 characters fall into a single category, script, and block, each listed as (unknown).

Q015
Categorical

IMBALANCE 

Distinct: 5
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 17.9 MiB

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Characters and Unicode

Total characters: 2344823
Distinct characters: 5
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 1
2nd row: 1
3rd row: 1
4th row: 1
5th row: 1

Common Values

Value | Count | Frequency (%)
1 | 2010385 | 85.7%
2 | 330081 | 14.1%
3 | 4012 | 0.2%
4 | 236 | < 0.1%
5 | 109 | < 0.1%

Length

[Figure: histogram of lengths of the category]

Common Values (Plot)

[Figure: plot of the Common Values counts]

Most occurring characters

Every value is a single character, so the character counts (overall and per category, script, and block) are identical to the Common Values table. All 2344823 characters fall into a single category, script, and block, each listed as (unknown).

Q016
Categorical

IMBALANCE 

Distinct: 5
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 17.9 MiB

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Characters and Unicode

Total characters: 2344823
Distinct characters: 5
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 1
2nd row: 2
3rd row: 2
4th row: 1
5th row: 1

Common Values

Value | Count | Frequency (%)
2 | 1239159 | 52.8%
1 | 1085694 | 46.3%
3 | 18910 | 0.8%
4 | 821 | < 0.1%
5 | 239 | < 0.1%

Length

[Figure: histogram of lengths of the category]

Common Values (Plot)

[Figure: plot of the Common Values counts]

Most occurring characters

Every value is a single character, so the character counts (overall and per category, script, and block) are identical to the Common Values table. All 2344823 characters fall into a single category, script, and block, each listed as (unknown).

Q017
Categorical

IMBALANCE 

Distinct: 5
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 17.9 MiB

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Characters and Unicode

Total characters: 2344823
Distinct characters: 5
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 1
2nd row: 1
3rd row: 1
4th row: 1
5th row: 1

Common Values

Value | Count | Frequency (%)
1 | 2254762 | 96.2%
2 | 88434 | 3.8%
3 | 1382 | 0.1%
4 | 137 | < 0.1%
5 | 108 | < 0.1%

Length

[Figure: histogram of lengths of the category]

Common Values (Plot)

[Figure: plot of the Common Values counts]

Most occurring characters

Every value is a single character, so the character counts (overall and per category, script, and block) are identical to the Common Values table. All 2344823 characters fall into a single category, script, and block, each listed as (unknown).

Q018
Categorical

HIGH CORRELATION 

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 17.9 MiB

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Characters and Unicode

Total characters: 2344823
Distinct characters: 2
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 1
2nd row: 1
3rd row: 1
4th row: 1
5th row: 1

Common Values

Value | Count | Frequency (%)
1 | 1695096 | 72.3%
2 | 649727 | 27.7%

Length

[Figure: histogram of lengths of the category]

Common Values (Plot)

[Figure: plot of the Common Values counts]

Most occurring characters

Every value is a single character, so the character counts (overall and per category, script, and block) are identical to the Common Values table. All 2344823 characters fall into a single category, script, and block, each listed as (unknown).

Q019
Categorical

Distinct: 5
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 17.9 MiB

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Characters and Unicode

Total characters: 2344823
Distinct characters: 5
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 1
2nd row: 3
3rd row: 3
4th row: 2
5th row: 2

Common Values

Value | Count | Frequency (%)
2 | 1458493 | 62.2%
3 | 480748 | 20.5%
4 | 188222 | 8.0%
1 | 123290 | 5.3%
5 | 94070 | 4.0%

Length

[Figure: histogram of lengths of the category]

Common Values (Plot)

[Figure: plot of the Common Values counts]

Most occurring characters

Every value is a single character, so the character counts (overall and per category, script, and block) are identical to the Common Values table. All 2344823 characters fall into a single category, script, and block, each listed as (unknown).

Q020
Categorical

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 17.9 MiB

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Characters and Unicode

Total characters: 2344823
Distinct characters: 2
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 1
2nd row: 1
3rd row: 1
4th row: 1
5th row: 1

Common Values

Value | Count | Frequency (%)
1 | 1912069 | 81.5%
2 | 432754 | 18.5%

Length

[Figure: histogram of lengths of the category]

Common Values (Plot)

[Figure: plot of the Common Values counts]

Most occurring characters

Every value is a single character, so the character counts (overall and per category, script, and block) are identical to the Common Values table. All 2344823 characters fall into a single category, script, and block, each listed as (unknown).

Q021
Categorical

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 17.9 MiB

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Characters and Unicode

Total characters: 2344823
Distinct characters: 2
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 1
2nd row: 1
3rd row: 1
4th row: 1
5th row: 1

Common Values

Value | Count | Frequency (%)
1 | 1758480 | 75.0%
2 | 586343 | 25.0%

Length

[Figure: histogram of lengths of the category]

Common Values (Plot)

[Figure: plot of the Common Values counts]

Most occurring characters

Every value is a single character, so the character counts (overall and per category, script, and block) are identical to the Common Values table. All 2344823 characters fall into a single category, script, and block, each listed as (unknown).

Q022
Categorical

Distinct: 5
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 17.9 MiB

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Characters and Unicode

Total characters: 2344823
Distinct characters: 5
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 3
2nd row: 3
3rd row: 5
4th row: 2
5th row: 3

Common Values

Value | Count | Frequency (%)
4 | 757635 | 32.3%
5 | 601948 | 25.7%
3 | 599482 | 25.6%
2 | 337864 | 14.4%
1 | 47894 | 2.0%

Length

[Figure: histogram of lengths of the category]

Common Values (Plot)

[Figure: plot of the Common Values counts]

Most occurring characters

Every value is a single character, so the character counts (overall and per category, script, and block) are identical to the Common Values table. All 2344823 characters fall into a single category, script, and block, each listed as (unknown).

Q023
Categorical

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 17.9 MiB

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Characters and Unicode

Total characters: 2344823
Distinct characters: 2
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 1
2nd row: 2
3rd row: 1
4th row: 1
5th row: 1

Common Values

Value | Count | Frequency (%)
1 | 2023620 | 86.3%
2 | 321203 | 13.7%

Length

[Figure: histogram of lengths of the category]

Common Values (Plot)

[Figure: plot of the Common Values counts]

Most occurring characters

Every value is a single character, so the character counts (overall and per category, script, and block) are identical to the Common Values table. All 2344823 characters fall into a single category, script, and block, each listed as (unknown).

Q024
Categorical

Distinct: 5
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 17.9 MiB

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Characters and Unicode

Total characters: 2344823
Distinct characters: 5
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 1
2nd row: 2
3rd row: 1
4th row: 1
5th row: 1

Common Values

Value | Count | Frequency (%)
1 | 988546 | 42.2%
2 | 932055 | 39.7%
3 | 267466 | 11.4%
4 | 104512 | 4.5%
5 | 52244 | 2.2%

Length

[Figure: histogram of lengths of the category]

Common Values (Plot)

[Figure: plot of the Common Values counts]

Most occurring characters

Every value is a single character, so the character counts (overall and per category, script, and block) are identical to the Common Values table. All 2344823 characters fall into a single category, script, and block, each listed as (unknown).

Q025
Categorical

IMBALANCE 

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 17.9 MiB

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Characters and Unicode

Total characters: 2344823
Distinct characters: 2
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 2
2nd row: 2
3rd row: 2
4th row: 2
5th row: 2

Common Values

Value | Count | Frequency (%)
2 | 2156270 | 92.0%
1 | 188553 | 8.0%

Length

[Figure: histogram of lengths of the category]

Common Values (Plot)

[Figure: plot of the Common Values counts]

Most occurring characters

Every value is a single character, so the character counts (overall and per category, script, and block) are identical to the Common Values table. All 2344823 characters fall into a single category, script, and block, each listed as (unknown).

NU_NOTA_MEDIA
Real number (ℝ)

Distinct: 50309
Distinct (%): 2.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 543.48471
Minimum: 0
Maximum: 855.98
Zeros: 22
Zeros (%): < 0.1%
Negative: 0
Negative (%): 0.0%
Memory size: 17.9 MiB

Quantile statistics

Minimum: 0
5th percentile: 402.08
Q1: 484.52
Median: 540.54
Q3: 602.06
95th percentile: 693.14
Maximum: 855.98
Range: 855.98
Interquartile range (IQR): 117.54

Descriptive statistics

Standard deviation: 88.04023
Coefficient of variation (CV): 0.1619921
Kurtosis: -0.016444533
Mean: 543.48471
Median Absolute Deviation (MAD): 58.58
Skewness: 0.034281316
Sum: 1.2743754 × 10^9
Variance: 7751.0822
Monotonicity: Not monotonic
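
Several of the figures above are tied together by simple identities, which makes them easy to sanity-check. The sketch below recomputes the range, IQR, coefficient of variation, variance, and sum from the reported values. The variable names are illustrative, and small differences in the last digits are expected because the report rounds its inputs.

```python
# Values copied from the NU_NOTA_MEDIA summary and the dataset overview.
n = 2344823                      # number of observations
mean, std = 543.48471, 88.04023
minimum, maximum = 0.0, 855.98
q1, q3 = 484.52, 602.06

value_range = maximum - minimum  # reported: 855.98
iqr = q3 - q1                    # reported: 117.54
cv = std / mean                  # reported: 0.1619921
variance = std ** 2              # reported: 7751.0822
total = mean * n                 # reported: 1.2743754e9

print(f"range={value_range:.2f} iqr={iqr:.2f} cv={cv:.7f}")
print(f"variance={variance:.4f} sum={total:.7e}")
```

Each recomputed value agrees with the report to within rounding, which suggests the summary block is internally consistent.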
[Figure: histogram with fixed-size bins (bins=50)]

Common values

Value | Count | Frequency (%)
526.72 | 261 | < 0.1%
557.2 | 259 | < 0.1%
534.74 | 255 | < 0.1%
531.5 | 254 | < 0.1%
530.78 | 253 | < 0.1%
513.4 | 251 | < 0.1%
514.72 | 249 | < 0.1%
529.5 | 249 | < 0.1%
512.76 | 247 | < 0.1%
530.6 | 246 | < 0.1%
Other values (50299) | 2342299 | 99.9%

Extreme values: 10 smallest

Value | Count | Frequency (%)
0 | 22 | < 0.1%
56.14 | 1 | < 0.1%
64 | 1 | < 0.1%
66.1 | 1 | < 0.1%
69.8 | 1 | < 0.1%
72.12 | 1 | < 0.1%
80 | 1 | < 0.1%
82.28 | 1 | < 0.1%
89.12 | 1 | < 0.1%
92 | 1 | < 0.1%

Extreme values: 10 largest

Value | Count | Frequency (%)
855.98 | 1 | < 0.1%
855.82 | 1 | < 0.1%
851.84 | 1 | < 0.1%
849.86 | 1 | < 0.1%
848.32 | 1 | < 0.1%
843.5 | 1 | < 0.1%
842.02 | 1 | < 0.1%
841.98 | 1 | < 0.1%
841.76 | 1 | < 0.1%
841.1 | 1 | < 0.1%

Interactions

[Figures: 64 interaction plots, not reproduced here]

Correlations

[Figure: correlation heatmap]
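Variable | IN_TREINEIRO | NU_NOTA_MEDIA | Q001 | Q002 | Q003 | Q004 | Q005 | Q006 | Q007 | Q008 | Q009 | Q010 | Q011 | Q012 | Q013 | Q014 | Q015 | Q016 | Q017 | Q018 | Q019 | Q020 | Q021 | Q022 | Q023 | Q024 | Q025 | TP_COR_RACA | TP_ESCOLA | TP_ESTADO_CIVIL | TP_FAIXA_ETARIA | TP_LINGUA | TP_SEXO | TP_ST_CONCLUSAO
IN_TREINEIRO | 1.000 | 0.012 | 0.116 | 0.160 | 0.087 | 0.113 | 0.098 | 0.159 | 0.117 | 0.163 | 0.125 | 0.155 | 0.028 | 0.068 | 0.085 | 0.081 | 0.044 | 0.093 | 0.062 | 0.107 | 0.131 | 0.069 | 0.113 | 0.102 | 0.044 | 0.130 | 0.052 | -0.068 | 0.386 | 0.084 | -0.582 | 0.109 | 0.037 | 1.000
NU_NOTA_MEDIA | 0.012 | 1.000 | 0.256 | 0.315 | 0.248 | 0.278 | 0.039 | 0.463 | 0.125 | 0.179 | 0.110 | 0.162 | 0.034 | 0.065 | 0.130 | 0.122 | 0.059 | 0.129 | 0.077 | 0.289 | 0.148 | 0.129 | 0.197 | 0.125 | 0.164 | 0.221 | 0.188 | -0.228 | 0.192 | 0.038 | -0.073 | 0.277 | 0.060 | 0.064
Q001 | 0.116 | 0.256 | 1.000 | 0.501 | 0.502 | 0.357 | 0.057 | 0.375 | 0.183 | 0.240 | 0.154 | 0.214 | 0.046 | 0.096 | 0.140 | 0.170 | 0.090 | 0.161 | 0.120 | 0.366 | 0.197 | 0.174 | 0.312 | 0.166 | 0.202 | 0.262 | 0.211 | -0.160 | 0.204 | 0.077 | -0.218 | 0.260 | 0.066 | 0.124
Q002 | 0.160 | 0.315 | 0.501 | 1.000 | 0.349 | 0.519 | 0.056 | 0.466 | 0.166 | 0.216 | 0.149 | 0.203 | 0.022 | 0.086 | 0.125 | 0.156 | 0.077 | 0.147 | 0.094 | 0.317 | 0.173 | 0.135 | 0.278 | 0.154 | 0.166 | 0.236 | 0.212 | -0.177 | 0.191 | 0.098 | -0.283 | 0.236 | 0.066 | 0.134
Q003 | 0.087 | 0.248 | 0.502 | 0.349 | 1.000 | 0.497 | 0.063 | 0.387 | 0.229 | 0.276 | 0.170 | 0.240 | 0.075 | 0.115 | 0.157 | 0.191 | 0.102 | 0.182 | 0.137 | 0.385 | 0.227 | 0.187 | 0.343 | 0.179 | 0.216 | 0.272 | 0.237 | -0.165 | 0.219 | 0.043 | -0.152 | 0.267 | 0.063 | 0.103
Q004 | 0.113 | 0.278 | 0.357 | 0.519 | 0.497 | 1.000 | 0.056 | 0.464 | 0.227 | 0.256 | 0.159 | 0.224 | 0.062 | 0.106 | 0.147 | 0.191 | 0.094 | 0.176 | 0.124 | 0.351 | 0.207 | 0.163 | 0.319 | 0.168 | 0.196 | 0.259 | 0.244 | -0.183 | 0.201 | 0.048 | -0.194 | 0.249 | 0.065 | 0.107
Q005 | 0.098 | 0.039 | 0.057 | 0.056 | 0.063 | 0.056 | 1.000 | 0.049 | 0.047 | 0.083 | 0.195 | 0.118 | 0.058 | 0.063 | 0.059 | 0.069 | 0.046 | 0.073 | 0.045 | 0.128 | 0.107 | 0.074 | 0.096 | 0.332 | 0.071 | 0.091 | 0.100 | 0.065 | 0.097 | 0.064 | -0.111 | 0.079 | 0.036 | 0.110
Q006 | 0.159 | 0.463 | 0.375 | 0.466 | 0.387 | 0.464 | 0.049 | 1.000 | 0.307 | 0.366 | 0.249 | 0.349 | 0.051 | 0.161 | 0.219 | 0.257 | 0.141 | 0.242 | 0.180 | 0.528 | 0.310 | 0.253 | 0.450 | 0.260 | 0.279 | 0.375 | 0.308 | -0.290 | 0.252 | 0.021 | -0.217 | 0.299 | 0.098 | 0.111
Q007 | 0.117 | 0.125 | 0.183 | 0.166 | 0.229 | 0.227 | 0.047 | 0.307 | 1.000 | 0.288 | 0.181 | 0.218 | 0.033 | 0.145 | 0.145 | 0.103 | 0.102 | 0.126 | 0.158 | 0.249 | 0.223 | 0.161 | 0.277 | 0.111 | 0.148 | 0.220 | 0.066 | -0.135 | 0.159 | 0.017 | -0.141 | 0.140 | 0.036 | 0.076
Q008 | 0.163 | 0.179 | 0.240 | 0.216 | 0.276 | 0.256 | 0.083 | 0.366 | 0.288 | 1.000 | 0.343 | 0.306 | 0.042 | 0.226 | 0.212 | 0.205 | 0.121 | 0.208 | 0.168 | 0.444 | 0.318 | 0.237 | 0.382 | 0.224 | 0.245 | 0.301 | 0.247 | -0.222 | 0.213 | 0.027 | -0.217 | 0.236 | 0.062 | 0.107
Q009 | 0.125 | 0.110 | 0.154 | 0.149 | 0.170 | 0.159 | 0.195 | 0.249 | 0.181 | 0.343 | 1.000 | 0.242 | 0.066 | 0.185 | 0.168 | 0.167 | 0.093 | 0.159 | 0.096 | 0.305 | 0.240 | 0.181 | 0.279 | 0.271 | 0.172 | 0.214 | 0.262 | -0.153 | 0.121 | 0.051 | -0.201 | 0.153 | 0.043 | 0.095
Q010 | 0.155 | 0.162 | 0.214 | 0.203 | 0.240 | 0.224 | 0.118 | 0.349 | 0.218 | 0.306 | 0.242 | 1.000 | 0.055 | 0.177 | 0.211 | 0.240 | 0.128 | 0.231 | 0.131 | 0.479 | 0.267 | 0.211 | 0.359 | 0.240 | 0.239 | 0.281 | 0.260 | -0.262 | 0.175 | 0.024 | -0.247 | 0.233 | 0.060 | 0.113
Q011 | 0.028 | 0.034 | 0.046 | 0.022 | 0.075 | 0.062 | 0.058 | 0.051 | 0.033 | 0.042 | 0.066 | 0.055 | 1.000 | 0.043 | 0.039 | 0.064 | 0.079 | 0.058 | 0.075 | 0.061 | 0.038 | 0.034 | 0.049 | 0.063 | 0.068 | 0.039 | 0.055 | 0.054 | 0.053 | 0.012 | -0.049 | 0.072 | 0.033 | 0.028
Q012 | 0.068 | 0.065 | 0.096 | 0.086 | 0.115 | 0.106 | 0.063 | 0.161 | 0.145 | 0.226 | 0.185 | 0.177 | 0.043 | 1.000 | 0.310 | 0.155 | 0.111 | 0.187 | 0.116 | 0.232 | 0.207 | 0.159 | 0.207 | 0.134 | 0.140 | 0.145 | 0.184 | -0.112 | 0.086 | 0.015 | -0.108 | 0.104 | 0.039 | 0.052
Q013 | 0.085 | 0.130 | 0.140 | 0.125 | 0.157 | 0.147 | 0.059 | 0.219 | 0.145 | 0.212 | 0.168 | 0.211 | 0.039 | 0.310 | 1.000 | 0.210 | 0.159 | 0.215 | 0.114 | 0.362 | 0.213 | 0.210 | 0.287 | 0.177 | 0.186 | 0.199 | 0.210 | -0.202 | 0.112 | 0.041 | -0.181 | 0.199 | 0.033 | 0.077
Q014 | 0.081 | 0.122 | 0.170 | 0.156 | 0.191 | 0.191 | 0.069 | 0.257 | 0.103 | 0.205 | 0.167 | 0.240 | 0.064 | 0.155 | 0.210 | 1.000 | 0.320 | 0.292 | 0.207 | 0.391 | 0.204 | 0.163 | 0.295 | 0.196 | 0.207 | 0.225 | 0.288 | -0.235 | 0.118 | 0.012 | -0.159 | 0.217 | 0.064 | 0.073
Q015 | 0.044 | 0.059 | 0.090 | 0.077 | 0.102 | 0.094 | 0.046 | 0.141 | 0.102 | 0.121 | 0.093 | 0.128 | 0.079 | 0.111 | 0.159 | 0.320 | 1.000 | 0.236 | 0.328 | 0.249 | 0.121 | 0.141 | 0.201 | 0.095 | 0.100 | 0.129 | 0.098 | -0.111 | 0.074 | 0.015 | -0.097 | 0.096 | 0.029 | 0.046
Q016 | 0.093 | 0.129 | 0.161 | 0.147 | 0.182 | 0.176 | 0.073 | 0.242 | 0.126 | 0.208 | 0.159 | 0.231 | 0.058 | 0.187 | 0.215 | 0.292 | 0.236 | 1.000 | 0.252 | 0.410 | 0.225 | 0.189 | 0.300 | 0.173 | 0.212 | 0.221 | 0.261 | -0.242 | 0.130 | 0.016 | -0.164 | 0.220 | 0.059 | 0.075
Q017 | 0.062 | 0.077 | 0.120 | 0.094 | 0.137 | 0.124 | 0.045 | 0.180 | 0.158 | 0.168 | 0.096 | 0.131 | 0.075 | 0.116 | 0.114 | 0.207 | 0.328 | 0.252 | 1.000 | 0.234 | 0.130 | 0.144 | 0.183 | 0.067 | 0.125 | 0.152 | 0.053 | -0.113 | 0.108 | 0.007 | -0.084 | 0.106 | 0.040 | 0.042
Q018 | 0.107 | 0.289 | 0.366 | 0.317 | 0.385 | 0.351 | 0.128 | 0.528 | 0.249 | 0.444 | 0.305 | 0.479 | 0.061 | 0.232 | 0.362 | 0.391 | 0.249 | 0.410 | 0.234 | 1.000 | 0.460 | 0.250 | 0.337 | 0.325 | 0.240 | 0.490 | 0.171 | -0.259 | 0.221 | 0.029 | -0.172 | 0.236 | 0.054 | 0.139
Q019 | 0.131 | 0.148 | 0.197 | 0.173 | 0.227 | 0.207 | 0.107 | 0.310 | 0.223 | 0.318 | 0.240 | 0.267 | 0.038 | 0.207 | 0.213 | 0.204 | 0.121 | 0.225 | 0.130 | 0.460 | 1.000 | 0.279 | 0.413 | 0.234 | 0.263 | 0.282 | 0.231 | -0.219 | 0.191 | 0.032 | -0.197 | 0.231 | 0.084 | 0.095
Q020 | 0.069 | 0.129 | 0.174 | 0.135 | 0.187 | 0.163 | 0.074 | 0.253 | 0.161 | 0.237 | 0.181 | 0.211 | 0.034 | 0.159 | 0.210 | 0.163 | 0.141 | 0.189 | 0.144 | 0.250 | 0.279 | 1.000 | 0.211 | 0.192 | 0.173 | 0.267 | 0.079 | -0.100 | 0.109 | 0.033 | -0.107 | 0.109 | 0.020 | 0.093
Q021 | 0.113 | 0.197 | 0.312 | 0.278 | 0.343 | 0.319 | 0.096 | 0.450 | 0.277 | 0.382 | 0.279 | 0.359 | 0.049 | 0.207 | 0.287 | 0.295 | 0.201 | 0.300 | 0.183 | 0.337 | 0.413 | 0.211 | 1.000 | 0.292 | 0.229 | 0.382 | 0.151 | -0.162 | 0.206 | 0.041 | -0.175 | 0.168 | 0.023 | 0.148
Q022 | 0.102 | 0.125 | 0.166 | 0.154 | 0.179 | 0.168 | 0.332 | 0.260 | 0.111 | 0.224 | 0.271 | 0.240 | 0.063 | 0.134 | 0.177 | 0.196 | 0.095 | 0.173 | 0.067 | 0.325 | 0.234 | 0.192 | 0.292 | 1.000 | 0.164 | 0.230 | 0.348 | -0.151 | 0.107 | 0.070 | -0.212 | 0.187 | 0.045 | 0.092
Q023 | 0.044 | 0.164 | 0.202 | 0.166 | 0.216 | 0.196 | 0.071 | 0.279 | 0.148 | 0.245 | 0.172 | 0.239 | 0.068 | 0.140 | 0.186 | 0.207 | 0.100 | 0.212 | 0.125 | 0.240 | 0.263 | 0.173 | 0.229 | 0.164 | 1.000 | 0.271 | 0.105 | -0.123 | 0.134 | 0.024 | -0.065 | 0.133 | 0.033 | 0.050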
Q0240.1300.2210.2620.2360.2720.2590.0910.3750.2200.3010.2140.2810.0390.1450.1990.2250.1290.2210.1520.4900.2820.2670.3820.2300.2711.0000.300-0.2600.2070.017-0.1360.2800.0990.081
Q0250.0520.1880.2110.2120.2370.2440.1000.3080.0660.2470.2620.2600.0550.1840.2100.2880.0980.2610.0530.1710.2310.0790.1510.3480.1050.3001.000-0.1280.0840.017-0.0940.1290.0330.062
TP_COR_RACA-0.068-0.228-0.160-0.177-0.165-0.1830.065-0.290-0.135-0.222-0.153-0.2620.054-0.112-0.202-0.235-0.111-0.242-0.113-0.259-0.219-0.100-0.162-0.151-0.123-0.260-0.1281.0000.1110.0450.1100.1800.0190.066
TP_ESCOLA0.3860.1920.2040.1910.2190.2010.0970.2520.1590.2130.1210.1750.0530.0860.1120.1180.0740.1300.1080.2210.1910.1090.2060.1070.1340.2070.0840.1111.0000.096-0.3210.1340.0500.707
TP_ESTADO_CIVIL0.0840.0380.0770.0980.0430.0480.0640.0210.0170.0270.0510.0240.0120.0150.0410.0120.0150.0160.0070.0290.0320.0330.0410.0700.0240.0170.0170.0450.0961.0000.1900.0850.0180.116
TP_FAIXA_ETARIA-0.582-0.073-0.218-0.283-0.152-0.194-0.111-0.217-0.141-0.217-0.201-0.247-0.049-0.108-0.181-0.159-0.097-0.164-0.084-0.172-0.197-0.107-0.175-0.212-0.065-0.136-0.0940.110-0.3210.1901.0000.1770.0320.498
TP_LINGUA0.1090.2770.2600.2360.2670.2490.0790.2990.1400.2360.1530.2330.0720.1040.1990.2170.0960.2200.1060.2360.2310.1090.1680.1870.1330.2800.1290.1800.1340.0850.1771.0000.0980.136
TP_SEXO0.0370.0600.0660.0660.0630.0650.0360.0980.0360.0620.0430.0600.0330.0390.0330.0640.0290.0590.0400.0540.0840.0200.0230.0450.0330.0990.0330.0190.0500.0180.0320.0981.0000.042
TP_ST_CONCLUSAO1.0000.0640.1240.1340.1030.1070.1100.1110.0760.1070.0950.1130.0280.0520.0770.0730.0460.0750.0420.1390.0950.0930.1480.0920.0500.0810.0620.0660.7070.1160.4980.1360.0421.000
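
As a reading aid, the strongest entries in the matrix above can be pulled out programmatically. The sketch below uses only the Python standard library. The `CORR` dictionary holds a handful of coefficients copied from the table; the `high_pairs` helper and the 0.5 cutoff are illustrative assumptions, not part of the report's own output.

```python
# A few coefficients copied from the correlation table (each pair listed once).
CORR = {
    ("Q001", "Q002"): 0.501,
    ("Q001", "Q003"): 0.502,
    ("Q002", "Q004"): 0.519,
    ("Q006", "Q018"): 0.528,
    ("TP_ESCOLA", "TP_ST_CONCLUSAO"): 0.707,
    ("IN_TREINEIRO", "TP_FAIXA_ETARIA"): -0.582,
    ("TP_FAIXA_ETARIA", "TP_ST_CONCLUSAO"): 0.498,
    ("Q006", "TP_COR_RACA"): -0.290,
}


def high_pairs(corr, threshold=0.5):
    """Return (a, b, r) for pairs with |r| >= threshold, strongest first.

    The default threshold of 0.5 is an assumption made for this sketch.
    """
    hits = [(a, b, r) for (a, b), r in corr.items() if abs(r) >= threshold]
    return sorted(hits, key=lambda item: abs(item[2]), reverse=True)


if __name__ == "__main__":
    for a, b, r in high_pairs(CORR):
        print(f"{a} ~ {b}: {r:+.3f}")
```

Comparing absolute values keeps strongly negative pairs such as IN_TREINEIRO and TP_FAIXA_ETARIA in the result, while the 0.498 entry falls just below the assumed cutoff.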

Missing values

A simple visualization of nullity by column.
The nullity matrix is a data-dense display that lets you quickly pick out patterns in data completion.
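
The per-column nullity that these charts summarize amounts to counting missing cells column by column. A minimal standard-library sketch follows. The `nullity_by_column` helper and the `records` list are hypothetical and for illustration only; the sample values are not taken from the dataset, which reports zero missing cells.

```python
def nullity_by_column(rows):
    """Count missing (None) cells per column across a list of dict records."""
    counts = {}
    for row in rows:
        for column, value in row.items():
            counts[column] = counts.get(column, 0) + (1 if value is None else 0)
    return counts


# Hypothetical records for illustration only (not from the profiled dataset).
records = [
    {"TP_SEXO": 0, "Q001": 5, "NU_NOTA_MEDIA": 500.0},
    {"TP_SEXO": 1, "Q001": None, "NU_NOTA_MEDIA": 400.0},
    {"TP_SEXO": None, "Q001": None, "NU_NOTA_MEDIA": 450.0},
]

if __name__ == "__main__":
    for column, missing in nullity_by_column(records).items():
        print(f"{column}: {missing} missing of {len(records)}")
```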

Sample

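First rows

,TP_FAIXA_ETARIA,TP_SEXO,TP_ESTADO_CIVIL,TP_COR_RACA,TP_ST_CONCLUSAO,TP_ESCOLA,IN_TREINEIRO,TP_LINGUA,Q001,Q002,Q003,Q004,Q005,Q006,Q007,Q008,Q009,Q010,Q011,Q012,Q013,Q014,Q015,Q016,Q017,Q018,Q019,Q020,Q021,Q022,Q023,Q024,Q025,NU_NOTA_MEDIA
0,5,0,1,2,1,1,0,1,5,6,1,4,2,2,1,2,3,1,1,2,2,1,1,1,1,1,1,1,1,3,1,1,2,558.24
1,6,1,1,3,1,1,0,1,3,1,1,2,3,1,1,3,4,1,1,2,1,2,1,2,1,1,3,1,1,3,2,2,2,394.62
2,6,0,1,2,1,1,0,1,5,5,2,2,5,2,1,2,3,2,1,3,2,1,1,2,1,1,3,1,1,5,1,1,2,414.10
3,4,0,1,3,1,1,0,1,5,5,2,2,2,2,1,2,2,1,1,2,1,1,1,1,1,1,2,1,1,2,1,1,2,438.10
4,2,0,1,1,2,3,0,1,5,5,2,1,4,2,1,2,3,1,2,2,1,1,1,1,1,1,2,1,1,3,1,1,2,576.70
5,2,0,1,3,3,1,1,0,7,6,6,6,2,2,1,2,3,2,1,2,1,2,1,2,1,1,2,1,1,3,2,2,2,530.58
6,8,0,1,2,1,1,0,1,2,6,1,4,6,2,1,2,5,2,2,2,1,2,1,1,1,1,2,1,1,4,1,1,2,645.80
7,1,0,1,3,3,1,1,0,8,5,3,2,6,2,1,2,3,1,1,2,1,2,1,1,1,1,2,1,1,4,1,2,2,378.74
8,4,0,1,1,1,1,0,1,2,4,4,2,2,2,1,3,3,2,1,2,2,2,1,2,1,1,2,2,1,3,1,2,2,500.40
9,4,1,1,3,1,1,0,1,5,5,2,2,3,5,1,3,3,2,1,2,2,2,1,1,1,1,3,1,1,2,1,1,2,605.58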

Last rows

,TP_FAIXA_ETARIA,TP_SEXO,TP_ESTADO_CIVIL,TP_COR_RACA,TP_ST_CONCLUSAO,TP_ESCOLA,IN_TREINEIRO,TP_LINGUA,Q001,Q002,Q003,Q004,Q005,Q006,Q007,Q008,Q009,Q010,Q011,Q012,Q013,Q014,Q015,Q016,Q017,Q018,Q019,Q020,Q021,Q022,Q023,Q024,Q025,NU_NOTA_MEDIA
23448131211311003462371232122212125224232599.36
2344814211122004533361341223212123214122526.38
2344815611011002511541251121211112113112533.66
2344816310211005562521331122221112214122467.20
2344817301122005634461231122211112115122515.02
23448181212111015533431222221212112123112488.40
23448191101211018363141221122111112112122617.92
2344820210322008532221221122211112113112541.22
23448211101111003222321232121111122213111507.22
2344822211122005332471342221212123215122607.06

Duplicate rows

Most frequently occurring

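,TP_FAIXA_ETARIA,TP_SEXO,TP_ESTADO_CIVIL,TP_COR_RACA,TP_ST_CONCLUSAO,TP_ESCOLA,IN_TREINEIRO,TP_LINGUA,Q001,Q002,Q003,Q004,Q005,Q006,Q007,Q008,Q009,Q010,Q011,Q012,Q013,Q014,Q015,Q016,Q017,Q018,Q019,Q020,Q021,Q022,Q023,Q024,Q025,NU_NOTA_MEDIA,# duplicates
0,2,0,1,3,2,2,0,1,2,2,1,1,3,2,1,2,3,1,1,2,1,1,1,1,1,1,2,1,1,2,1,1,2,495.84,2
1,3,0,1,3,2,2,0,1,2,2,1,1,4,2,1,2,3,1,1,2,1,1,1,1,1,1,2,1,1,3,1,1,2,498.88,2
2,4,0,1,3,1,1,0,1,1,5,2,2,4,2,1,2,2,1,1,2,1,1,1,1,1,1,2,1,1,3,1,1,2,578.24,2
3,4,0,1,3,1,1,0,1,2,2,1,1,3,2,1,2,3,1,1,2,1,1,1,1,1,1,2,1,1,2,1,1,2,461.12,2
4,4,0,1,3,1,1,0,1,2,2,1,1,4,2,1,2,3,1,2,2,1,1,1,1,1,1,2,1,1,2,1,1,2,561.40,2
5,4,0,1,3,1,1,0,1,3,3,2,2,3,2,1,2,3,1,1,2,1,1,1,1,1,1,2,1,1,3,1,1,2,478.30,2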
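
A listing like the one above can be reproduced by counting identical rows and keeping those that occur more than once. The sketch below is a minimal standard-library version. The `most_frequent_duplicates` helper and the three-column `rows` sample are hypothetical and shortened for illustration; this is not the profiling tool's own implementation.

```python
from collections import Counter


def most_frequent_duplicates(rows, top=10):
    """Return up to `top` (row, count) pairs for rows that occur more than once."""
    counts = Counter(tuple(row) for row in rows)
    # most_common() orders rows by descending count; keep only true duplicates.
    duplicated = [(row, n) for row, n in counts.most_common() if n > 1]
    return duplicated[:top]


# Hypothetical miniature rows (three columns only), for illustration.
rows = [
    (2, 0, 495.84),
    (3, 0, 498.88),
    (2, 0, 495.84),
    (4, 1, 578.24),
    (3, 0, 498.88),
    (5, 1, 530.58),
]

if __name__ == "__main__":
    for row, n in most_frequent_duplicates(rows):
        print(row, "occurs", n, "times")
```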